Search results for "Variable selection"

showing 10 items of 24 documents

Evaluation of the effect of chance correlations on variable selection using Partial Least Squares -Discriminant Analysis

2013

Variable subset selection is often mandatory in high-throughput metabolomics and proteomics. However, depending on the variable-to-sample ratio, variable selection is highly susceptible to chance correlations. Evaluating the predictive capability of PLSDA models by cross-validation after feature selection gives overly optimistic results if the selection is performed on the entire set and no external validation set is available. In this work, a simulation of the statistical null hypothesis is proposed to test whether the discrimination capability of a PLSDA model after variable selection estimated by cross-validation is statistically higher than t…

Variable selection; ESTADISTICA E INVESTIGACION OPERATIVA; Feature selection; Chance correlations; Analytical Chemistry; Set (abstract data type); Resampling; Partial least squares regression; Statistics; Humans; Metabolomics; Least-Squares Analysis; Selection (genetic algorithm); Probability; Gaucher Disease; Models, Statistical; Chemistry; Discriminant Analysis; Reproducibility of Results; Partial Least Squares-Discriminant Analysis (PLSDA); Linear discriminant analysis; Variable (computer science); Null hypothesis; Algorithms; Software
researchProduct
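
The optimism described in the abstract above (feature selection on the entire set followed by cross-validation) can be reproduced on pure noise. The following is a minimal sketch, not the authors' procedure: it assumes a nearest-centroid classifier in place of PLSDA and a simple mean-difference filter for selection, and the names `top_k_features`, `centroid_predict`, and `cv_accuracy` are hypothetical helpers introduced here for illustration only.

```python
import numpy as np

def top_k_features(X, y, k):
    """Indices of the k features with the largest absolute class-mean difference."""
    diff = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    return np.argsort(-np.abs(diff))[:k]

def centroid_predict(X_train, y_train, X_test):
    """Nearest-centroid classifier (a stand-in for PLSDA, assumed for brevity)."""
    c0 = X_train[y_train == 0].mean(axis=0)
    c1 = X_train[y_train == 1].mean(axis=0)
    d0 = ((X_test - c0) ** 2).sum(axis=1)
    d1 = ((X_test - c1) ** 2).sum(axis=1)
    return (d1 < d0).astype(int)

def cv_accuracy(X, y, k, n_folds, select_inside, rng):
    """Cross-validated accuracy with selection done inside or outside the folds."""
    n = len(y)
    folds = np.array_split(rng.permutation(n), n_folds)
    global_idx = top_k_features(X, y, k)  # selection on the entire set
    correct = 0
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        # Honest variant: redo the selection using the training fold only.
        idx = top_k_features(X[train], y[train], k) if select_inside else global_idx
        pred = centroid_predict(X[train][:, idx], y[train], X[test][:, idx])
        correct += int((pred == y[test]).sum())
    return correct / n

rng = np.random.default_rng(0)
n, p, k = 40, 2000, 10            # many more variables than samples
X = rng.standard_normal((n, p))   # pure noise: the null hypothesis holds
y = np.repeat([0, 1], n // 2)     # labels carry no information about X

biased = cv_accuracy(X, y, k, 5, select_inside=False, rng=rng)
honest = cv_accuracy(X, y, k, 5, select_inside=True, rng=rng)
print(f"selection on entire set: {biased:.2f}, selection inside folds: {honest:.2f}")
```

Because the labels are unrelated to the data, any accuracy well above chance in the first variant comes from chance correlations picked up by the selection step; repeating the selection inside each fold removes that leakage.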

Variable selection in the analysis of energy consumption-growth nexus

2015

There is abundant empirical literature that focuses on whether energy consumption is a critical driver of economic growth. The evolution of this literature has largely consisted of attempts to solve the problems and answer the criticisms arising from earlier studies. One of the most common criticisms is that previous work concentrates on the bivariate relationship, energy consumption–economic growth. Many authors try to overcome this critique using control variables. However, the choice of these variables has been ad hoc, made according to the subjective economic rationale of the authors. Our contribution to this literature is to apply a robust probabilistic model to select the explanatory …

Economics and Econometrics; Control variables; Variable selection; Energy (esotericism); Probabilistic model; Control variable; Statistical model; Bivariate analysis; Energy consumption; Causality; General Energy; Energy intensity; Econometrics; Economics; Nexus (standard); Economic growth
researchProduct

Induced smoothing in LASSO regression

The thesis is being carried out with the National Research Council at the Institute of Biomedicine and Molecular Immunology "Alberto Monroy" of Palermo, where I am a fellow, under the supervision of MD Stefania La Grutta. Our research unit focuses on clinical research into allergic respiratory problems in children. In particular, we are interested in assessing the determinants of impaired lung function in a sample of outpatient asthmatic children aged between 5 and 17 years, enrolled from 2011 to 2017. Our dataset comprises n = 529 children and several covariates regarding host and environmental factors. This thesis focuses on hypothesis testing in lasso regression, when one is interes…

LASSO regression; Induced smoothing; Sandwich formula; Sparse models; Variable selection; Sparse model; Settore SECS-S/01 - Statistica
researchProduct

Variable selection with unbiased estimation: the CDF penalty

2022

We propose a new SCAD-type penalty in general regression models. The new penalty can be considered a competitor of the LASSO, SCAD or MCP penalties, as it guarantees sparse variable selection, i.e., null regression coefficient estimates, while attenuating bias for the non-null estimates. In this work, the method is discussed, and some comparisons are presented.

Variable selection; L1-type penalty; LASSO; SCAD; MCP
researchProduct
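
The trade-off named in the abstract above (exact zeros for small coefficients, less bias for large ones) can be seen by comparing the LASSO and SCAD thresholding rules. This sketch does not implement the CDF penalty, whose form the abstract does not give. It assumes the standard closed-form rules for a single coefficient under an orthonormal design, with the conventional SCAD setting a = 3.7; the function names are hypothetical.

```python
import numpy as np

def soft_threshold(z, lam):
    """LASSO rule: shrink every coefficient toward zero by lam."""
    z = np.asarray(z, dtype=float)
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def scad_threshold(z, lam, a=3.7):
    """SCAD rule: soft-threshold small values, leave large values unshrunk."""
    z = np.asarray(z, dtype=float)
    absz = np.abs(z)
    # Linear interpolation zone between soft thresholding and the identity.
    middle = ((a - 1.0) * z - np.sign(z) * a * lam) / (a - 2.0)
    return np.where(absz <= 2.0 * lam, soft_threshold(z, lam),
                    np.where(absz <= a * lam, middle, z))

z = np.array([0.5, 1.5, 3.0, 5.0])   # unpenalized estimates (illustrative values)
lasso_est = soft_threshold(z, 1.0)
scad_est = scad_threshold(z, 1.0)
print(lasso_est)
print(scad_est)
```

Both rules set the smallest input to zero, but the LASSO keeps subtracting lam from the largest input, while SCAD returns it unchanged; that difference is the bias attenuation the abstract refers to.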

Geographic mosaic of selection by avian predators on hindwing warning colour in a polymorphic aposematic moth

2020

Warning signals are predicted to develop signal monomorphism via positive frequency-dependent selection (+FDS), although many aposematic systems exhibit signal polymorphism. To understand this mismatch, we conducted a large-scale predation experiment in four locations, among which the frequencies of hindwing warning coloration of aposematic Arctia plantaginis differ. Here we show that selection by avian predators on warning colour is predicted by local morph frequency and predator community composition. We found +FDS to be strongest in monomorphic Scotland and, in contrast, weakest in polymorphic Finland, where different predators favour different male morphs. +FDS was also found in Geo…

0106 biological sciences; predators; predator-prey interactions; Frequency-dependent selection; FREQUENCY-DEPENDENT SELECTION; DIVERSITY; Moths; 01 natural sciences; Müllerian mimicry; täpläsiilikäs; Predation; muuntelu (biologia); Arctia plantaginis; Predator; Finland; 0303 health sciences; Monomorphism; saaliseläimet; luonnonvalinta; Ecology; wood tiger moth; VARIABLE SELECTION; DIFFERENTIATION; POISON FROG; 1181 Ecology evolutionary biology; MULLERIAN MIMICRY; varoitusväri; Color; Zoology; Aposematism; Biology; 010603 evolutionary biology; Birds; 03 medical and health sciences; PARASEMIA; colour polymorphism; petoeläimet; Animals; aposematism; frequency‐dependent selection; Ecology, Evolution, Behavior and Systematics; Selection (genetic algorithm); 030304 developmental biology; signal variation; signal convergence; 010604 marine biology & hydrobiology; predator–prey interactions; EVOLUTION; SIGNAL; Scotland; Community composition; Predatory Behavior
researchProduct

Model uncertainty and variable selection: an application to the modelization of FDI determinants in Europe

2019

Recent decades have seen growing interest in FDI and an increasing debate about how to model it, in terms of the variables considered as its determinants, the model specification, and the methods for estimating the FDI gravity model. This is due to the uncertainty surrounding both the theories of FDI and the empirical approaches to it. This doctoral thesis aims to contribute to the literature by investigating the driving forces behind the activities of MNEs to and from European countries, at both the regional and the national level, addressing the problems of variable selection and model uncertainty that arise when modelling the…

gravity model; :CIENCIAS ECONÓMICAS::Econometría::Modelos econométricos [UNESCO]; UNESCO::CIENCIAS ECONÓMICAS::Economía internacional; generalized linear models; germany; UNESCO::CIENCIAS ECONÓMICAS::Econometría::Modelos econométricos; bayesian model averaging; spanish regions; :CIENCIAS ECONÓMICAS::Economía internacional::Inversión exterior [UNESCO]; :CIENCIAS ECONÓMICAS::Economía internacional [UNESCO]; UNESCO::CIENCIAS ECONÓMICAS::Economía internacional::Inversión exterior; foreign direct investment determinants; variable selection
researchProduct

Differential geometric least angle regression: a differential geometric approach to sparse generalized linear models

2013

Sparsity is an essential feature of many contemporary data problems. Remote sensing, various forms of automated screening and other high-throughput measurement devices collect a large amount of information, typically about a few independent statistical subjects or units. In certain cases it is reasonable to assume that the underlying process generating the data is itself sparse, in the sense that only a few of the measured variables are involved in the process. We propose an explicit method of monotonically decreasing sparsity for outcomes that can be modelled by an exponential family. In our approach we generalize the equiangular condition in a generalized linear model. Although the …

Statistics and Probability; Generalized linear model; Sparse model; Mathematical optimization; Generalized linear models; Variable selection; Path following algorithm; Equiangular polygon; LASSO; DANTZIG SELECTOR; Exponential family; Lasso (statistics); Sparse models; Differential geometry; Information geometry; COORDINATE DESCENT; Fisher information; ERROR; Mathematics; Least-angle regression; Least angle regression; Generalized degrees of freedom; SHRINKAGE; Statistics, Probability and Uncertainty; Simple linear regression; Settore SECS-S/01 - Statistica; Algorithm; Covariance penalty theory
researchProduct

Scad-elastic net and the estimation of individual tourism expenditure determinants

2014

This paper introduces the use of scad-elastic net in the assessment of the determinants of individual tourist spending. This technique addresses two main estimation-related issues of primary importance. So far, studies in the tourism literature have made wide use of classical regressions, whose results might be affected by multicollinearity. In addition, because of the absence of a robust economic theory of tourism behavior, regressor selection is often left to the researcher's choice when not driven by non-optimal automatic criteria. Scad-elastic net is an OLS model that accounts for both these problems by including two types of parameter constraints, namely the smoothly clipped absolute deviation …

Estimation; Elastic net regularization; Information Systems and Management; Variable selection; Penalized regression; Management Information Systems; Collinearity; Arts and Humanities (miscellaneous); Multicollinearity; Developmental and Educational Psychology; Econometrics; Per capita; Economics; Uruguay; Scad-elastic net; Tourism expenditure; Settore SECS-S/01 - Statistica; Scad; Accommodation; Psychographic; Tourism; Information Systems; Decision Support Systems
researchProduct

Estimation of sparse generalized linear models: the dglars package

2013

dglars is a publicly available R package that implements the method proposed in Augugliaro, Mineo and Wit (2013), developed to study the sparse structure of a generalized linear model. This method, called dgLARS, is based on a differential geometric extension of the least angle regression method (LARS). The core of the dglars package consists of two algorithms implemented in Fortran 90 to compute the solution curve efficiently: a predictor-corrector algorithm and a cyclic coordinate descent algorithm.

generalized linear models; dgLARS; predictor-corrector algorithm; cyclic coordinate descent algorithm; sparse models; variable selection; Settore SECS-S/01 - Statistica
researchProduct

Extended differential geometric LARS for high-dimensional GLMs with general dispersion parameter

2018

A large class of modeling and prediction problems involves outcomes that belong to an exponential family distribution. Generalized linear models (GLMs) are a standard way of dealing with such situations. Even in high-dimensional feature spaces GLMs can be extended to deal with such situations. Penalized inference approaches, such as the $\ell_1$ or SCAD, or extensions of least angle regression, such as dgLARS, have been proposed to deal with GLMs with high-dimensional feature spaces. Although the theory underlying these methods is in principle generic, the implementation has remained restricted to dispersion-free models, such as the Poisson and logistic regression models. The aim of this…

Statistics and Probability; Generalized linear model; Mathematical optimization; Generalized linear models; Predictor–corrector algorithm; 02 engineering and technology; Poisson distribution; DANTZIG SELECTOR; 01 natural sciences; Cross-validation; High-dimensional inference; Theoretical Computer Science; 010104 statistics & probability; Exponential family; LEAST ANGLE REGRESSION; 0202 electrical engineering electronic engineering information engineering; Applied mathematics; Statistics::Methodology; 0101 mathematics; CROSS-VALIDATION; Mathematics; Least-angle regression; Linear model; 020206 networking & telecommunications; Probability and statistics; VARIABLE SELECTION; Efficient estimator; Predictor-corrector algorithm; Computational Theory and Mathematics; Dispersion parameter; LINEAR-MODELS; SHRINKAGE; Statistics, Probability and Uncertainty; Settore SECS-S/01 - Statistica; Statistics and Computing
researchProduct